- Introduction to the project
- Project overview
- Methods used
- Results
- Discussion
- Conclusion
Spring 2020
Deep mutational scanning allows for assessing the functional consequences of up to hundreds of thousands of variants of a protein in a single experiment.
It combines high-throughput DNA sequencing with a selection in which a physical association is maintained between each protein variant and the DNA that encodes it.
Sequence analysis gives the frequency of each variant in the input population and in the population after selection; the ratio of these two frequencies serves as a proxy for the function of each variant.
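The frequency ratio described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the scoring used for the data sets in this project: the log2 transform, the pseudocount, the function name `enrichment_scores`, and the read counts are all assumptions made for the example.

```python
import math

def enrichment_scores(input_counts, selected_counts, pseudocount=0.5):
    """Return log2(frequency after selection / frequency in input) per variant.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for variants that are missing from one of the two populations.
    """
    variants = sorted(set(input_counts) | set(selected_counts))
    n_in = sum(input_counts.get(v, 0) + pseudocount for v in variants)
    n_sel = sum(selected_counts.get(v, 0) + pseudocount for v in variants)
    scores = {}
    for v in variants:
        f_in = (input_counts.get(v, 0) + pseudocount) / n_in
        f_sel = (selected_counts.get(v, 0) + pseudocount) / n_sel
        scores[v] = math.log2(f_sel / f_in)
    return scores

# Hypothetical read counts, invented for illustration only.
input_counts = {"WT": 1000, "A23G": 500, "L45P": 500}
selected_counts = {"WT": 1000, "A23G": 900, "L45P": 100}

scores = enrichment_scores(input_counts, selected_counts)
for variant, score in scores.items():
    print(f"{variant}: {score:+.2f}")
```

A positive score means the variant became more frequent after selection (enriched), a negative score means it was depleted.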
| Data set | Protein | Target | Biological activity | Species | Number of variants | Score |
|---|---|---|---|---|---|---|
| Data set 1 | BRCA1 | BARD1 RING domain | Ubiquitin E3 activity | H. sapiens | 5610 | Y2H assays |
| Data set 2 | ERK2 | Small molecule (SCH772984) | Resistance to drugs | H. sapiens | 6810 | Drug sensitivity assays; calculation of cell viability |
| Data set 3 | LDLRAP1 | OBFC1 | Protein translation | H. sapiens | 6385 | Y2H assays |
| Data set 4 | Pab1 | eIF4G1 | Translation initiation | S. cerevisiae | 1340 | Y2H assays |
| Function | Library |
|---|---|
| Data loading | readxl |
| Data cleaning and wrangling | dplyr, broom (tidyverse) |
| Data augmenting | dplyr (tidyverse), Peptides |
| Extracting data | UniprotR |
| Plotting | ggplot2 (tidyverse), ggseqlogo, ggpubr |
| Analysing | stats |
| Modeling | keras, neuralnet, caret, yardstick, glmnet, ANN2 |
An amino acid was counted as active if its score was > 0.
The scoring scale was truncated to the range -2 to 2.
The scores were normalized so that they sum to 1 at each position.
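The three scoring rules above can be combined as in the following Python sketch. It shows one possible reading, not necessarily the project's exact procedure: it assumes that truncation is applied first, that only active amino acids (score > 0) keep a weight, and that the weights at each position are divided by their sum. The function name `position_weights` and the example scores are hypothetical.

```python
def position_weights(scores_by_position, lower=-2.0, upper=2.0):
    """Clip scores to [lower, upper], keep active residues, normalize per position."""
    weights = {}
    for pos, aa_scores in scores_by_position.items():
        # Truncate the scoring scale to the range lower..upper.
        clipped = {aa: min(max(s, lower), upper) for aa, s in aa_scores.items()}
        # An amino acid counts as active only if its (clipped) score is > 0.
        active = {aa: s for aa, s in clipped.items() if s > 0}
        total = sum(active.values())
        # Normalize so that the weights at this position sum to 1;
        # a position without any active residue gets an empty entry.
        weights[pos] = {aa: s / total for aa, s in active.items()} if total > 0 else {}
    return weights

# Hypothetical scores, invented for illustration only.
scores_by_position = {
    1: {"A": 3.0, "G": 1.0, "P": -2.5},
    2: {"L": 0.5, "V": 0.5, "D": -0.1},
}

weights = position_weights(scores_by_position)
print(weights)
```

At position 1 the score of 3.0 is truncated to 2.0 before normalization, so `A` receives two thirds of the weight and `G` one third, while the inactive `P` is dropped.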
Supported machine learning framework:
| Scale(s) | Description of scale |
|---|---|
| blosum45, 50, 62, 80, 90 | Substitution matrix based on observed substitutions in aligned blocks of conserved protein sequences |
| pam30, 70, 250 | Substitution matrix based on observed mutations in phylogenetic trees |
| z5_scales | PCA of physicochemical properties |
Amino acid substitution matrices: https://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/data/
Z5-scales: https://pubs.acs.org/doi/10.1021/jm9700575
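To show how such a scale can turn a peptide sequence into numeric model input, the sketch below looks up a descriptor vector for each residue and concatenates the vectors. It is only an illustration under stated assumptions: `TOY_SCALE` holds made-up placeholder numbers, not real BLOSUM, PAM, or z-scale values, and `encode_sequence` is a hypothetical helper rather than a function from any of the packages listed above. With a substitution matrix, the descriptor of a residue would be its matrix row; with the z-scales, its vector of principal-component values.

```python
# Placeholder descriptors for three residues. These numbers are invented
# for illustration and are NOT real z-scale or substitution-matrix values.
TOY_SCALE = {
    "A": [0.1, -0.2],
    "G": [0.3, 0.0],
    "L": [-0.4, 0.5],
}

def encode_sequence(sequence, scale):
    """Concatenate the descriptor vector of every residue in the sequence."""
    features = []
    for aa in sequence:
        if aa not in scale:
            raise KeyError(f"no descriptor for residue {aa!r}")
        features.extend(scale[aa])
    return features

encoded = encode_sequence("GAL", TOY_SCALE)
print(encoded)
```

The resulting vector has one block of descriptor values per residue, so its length is the sequence length times the number of descriptors per residue.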